Abstract

Attributing a cyber-operation through the use of multiple pieces of technical
evidence (i.e., malware reverse-engineering and source tracking) and
conventional intelligence sources (i.e., human or signals intelligence) is a
difficult problem not only due to the effort required to obtain evidence, but
the ease with which an adversary can plant false evidence. In this paper, we
introduce a formal reasoning system called the InCA (Intelligent Cyber
Attribution) framework that is designed to aid an analyst in the attribution
of a cyber-operation even when the available information is conflicting and/or
uncertain. Our approach combines argumentation-based reasoning, logic
programming, and probabilistic models to not only attribute an operation but
also explain to the analyst why the system reaches its conclusions.

1 Introduction

An important issue in cyber-warfare is the puzzle of determining who was
responsible for a given cyber-operation, be it an incident of attack,
reconnaissance, or information theft. This is known as the "attribution
problem" [1]. The difficulty of this problem stems not only from the amount of
effort required to find forensic clues but also the ease with which an
attacker can plant false clues to mislead security personnel. Further, while
techniques such as forensics and reverse-engineering [2], source tracking [3],
honeypots [4], and sinkholing [5] are commonly employed to find evidence that
can lead to attribution, it is unclear how this evidence is to be combined and
reasoned about. In a military setting, such evidence is augmented with normal
intelligence collection, such as human intelligence (HUMINT), signals
intelligence (SIGINT) and other means; this adds additional complications to
the task of attributing a given operation. Essentially, cyber-attribution is a
highly technical intelligence analysis problem where an analyst must consider
a variety of sources, each with its associated level of confidence, to provide
a decision maker (e.g., a military commander) insight into who conducted a
given operation.

As it is well known that people's ability to conduct intelligence analysis is
limited [6], and due to the highly technical nature of many cyber
evidence-gathering techniques, an automated reasoning system would be best
suited for the task. Such a system must be able to accomplish several goals,
among which we distinguish the following main capabilities:

1. Reason about evidence in a formal, principled manner, 
i.e., relying on strong mathematical foundations. 

2. Consider evidence for cyber attribution associated with 
some level of probabilistic uncertainty. 

3. Consider logical rules that allow for the system to draw 
conclusions based on certain pieces of evidence and it- 
eratively apply such rules. 

4. Consider pieces of information that may not be com- 
patible with each other, decide which information is 
most relevant, and express why. 

5. Attribute a given cyber-operation based on the above- 
described features and provide the analyst with the 
ability to understand how the system arrived at that 
conclusion. 

In this paper we present the InCA (Intelligent Cyber Attribution) framework,
which provides all of the above capabilities. Our approach relies on several
techniques from the artificial intelligence community, including
argumentation, logic programming, and probabilistic reasoning. We first
outline the underlying mathematical framework and provide examples based on
real-world cases of cyber-attribution (cf. Section 2); then, in Sections 3 and
4 we formally present InCA and attribution queries, respectively. Finally, we
discuss conclusions and future work in Section 5.

2 Two Kinds of Models 

Our approach relies on two separate models of the world. The first, called the
environmental model (EM), is used to describe the background knowledge and is
probabilistic in nature. The second, called the analytical model (AM), is used
to analyze competing hypotheses that can account for a given phenomenon (in
this case, a cyber-operation). The EM must be consistent: there must exist a
probability distribution over the possible states of the world that satisfies
all of the constraints in the model, as well as the axioms of probability
theory. In contrast, the AM allows for contradictory information, as the
system must be able to reason about competing explanations for a given
cyber-operation. In general, the EM contains knowledge such as evidence,
intelligence reporting, or knowledge about actors, software, and systems. The
AM, on the other hand, contains ideas the analyst concludes based on the
information in the EM. Figure 1 gives some examples of the types of
information in the two models. Note that an analyst (or automated system)
could assign a probability to statements in the EM column, whereas statements
in the AM column can be true or false depending on a certain combination (or
several possible combinations) of statements from the EM. We now formally
describe these two models as well as a technique for annotating knowledge in
the AM with information from the EM; these annotations specify the conditions
under which the various statements in the AM can potentially be true.

EM: "Malware X was compiled on a system using the English language."
AM: "Malware X was compiled on a system in English-speaking country Y."

EM: "Malware W and malware X were created in a similar coding style."
AM: "Malware W and malware X are related."

EM: "Country Y and country Z are currently at war."
AM: "Country Y has a motive to launch a cyber-attack against country Z."

EM: "Country Y has a significant investment in math-science-engineering (MSE)
    education."
AM: "Country Y has the capability to conduct a cyber-attack."

Figure 1: Example observations - EM vs. AM.

Before describing the two models in detail, we first introduce the language
used to describe them. Variable and constant symbols represent items such as
computer systems, types of cyber operations, actors (e.g., nation states,
hacking groups), and other technical and/or intelligence information. The set
of all variable symbols is denoted with $\mathbf{V}$, and the set of all
constants is denoted with $\mathbf{C}$. For our framework, we shall require
two subsets of $\mathbf{C}$, $\mathbf{C}_{act}$ and $\mathbf{C}_{ops}$, that
specify the actors that could conduct cyber-operations and the operations
themselves, respectively. In the examples in this paper, we will use capital
letters to represent variables (e.g., $X, Y, Z$). The constants in
$\mathbf{C}_{act}$ and $\mathbf{C}_{ops}$ that we use in the running example
are specified in the following example.

Example 2.1 The following (fictitious) actors and cyber-operations will be
used in our examples:

$$\mathbf{C}_{act} = \{baja,\ krasnovia,\ mojave\} \tag{1}$$
$$\mathbf{C}_{ops} = \{worm123\} \tag{2}$$



The next component in the model is a set of predicate symbols. These
constructs can accept zero or more variables or constants as arguments, and
map to either true or false. Note that the EM and AM use separate sets of
predicate symbols; however, they can share variables and constants. The sets
of predicates for the EM and AM are denoted with $\mathbf{P}_{EM}$,
$\mathbf{P}_{AM}$, respectively. In InCA, we require $\mathbf{P}_{AM}$ to
include the binary predicate condOp(X, Y), where X is an actor and Y is a
cyber-operation. Intuitively, this means that actor X conducted operation Y.
For instance, condOp(baja, worm123) is true if baja was responsible for
cyber-operation worm123. A sample set of predicate symbols for the analysis of
a cyber attack between two states over contention of a particular industry is
shown in Figure 2; these will be used in examples throughout the paper.

A construct formed with a predicate and constants as arguments is known as a
ground atom (we shall often deal with ground atoms). The sets of all ground
atoms for EM and AM are denoted with $\mathbf{G}_{EM}$ and $\mathbf{G}_{AM}$,
respectively.

Example 2.2 The following are examples of ground atoms over the predicates
given in Figure 2.

$\mathbf{G}_{EM}$: origIP(mw123sam1, krasnovia),
                   mwHint(mw123sam1, krasnovia),
                   inLgConf(krasnovia, baja),
                   mseTT(krasnovia, 2)

$\mathbf{G}_{AM}$: evidOf(mojave, worm123),
                   motiv(baja, krasnovia),
                   expCw(baja),
                   tgt(krasnovia, worm123)

■

For a given set of ground atoms, a world is a subset of the atoms that are
considered to be true (ground atoms not in the world are false). Hence, there
are $2^{|\mathbf{G}_{EM}|}$ possible worlds in the EM and
$2^{|\mathbf{G}_{AM}|}$ worlds in the AM, denoted with $\mathcal{W}_{EM}$ and
$\mathcal{W}_{AM}$, respectively.

Clearly, even a moderate number of ground atoms can yield an enormous number
of worlds to explore. One way to reduce the number of worlds is to include
integrity constraints, which allow us to eliminate certain worlds from
consideration because they simply are not possible in the setting being
modeled. Our principal integrity constraint will be of the form:

$$\mathsf{oneOf}(\mathcal{A}')$$

where $\mathcal{A}'$ is a subset of ground atoms. Intuitively, this says that
any world in which more than one of the atoms from set $\mathcal{A}'$ appears
is invalid. Let $\mathbf{IC}_{EM}$ and $\mathbf{IC}_{AM}$ be the sets of
integrity constraints for the EM and AM, respectively, and let the sets of
worlds that conform to these constraints be
$\mathcal{W}_{EM}(\mathbf{IC}_{EM})$ and $\mathcal{W}_{AM}(\mathbf{IC}_{AM})$,
respectively.
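As an illustration of how worlds and oneOf constraints interact, the following is a minimal Python sketch, not part of the InCA formalism itself. The function names (`all_worlds`, `conforms`, `valid_worlds`) are hypothetical, and the particular constraint shown (a malware sample originates from at most one actor's IP space) is an assumption chosen only for the example.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List

World = FrozenSet[str]


def all_worlds(ground_atoms: Iterable[str]) -> List[World]:
    """Enumerate every subset of the ground atoms, i.e. 2^|G| worlds."""
    atoms = sorted(set(ground_atoms))
    return [
        frozenset(subset)
        for size in range(len(atoms) + 1)
        for subset in combinations(atoms, size)
    ]


def conforms(world: World, one_of_constraints: Iterable[Iterable[str]]) -> bool:
    """A world is valid iff it holds at most one atom from each oneOf set."""
    return all(len(world & frozenset(a_prime)) <= 1 for a_prime in one_of_constraints)


def valid_worlds(ground_atoms: Iterable[str],
                 one_of_constraints: Iterable[Iterable[str]]) -> List[World]:
    """Worlds that survive all integrity constraints."""
    constraints = [frozenset(c) for c in one_of_constraints]
    return [w for w in all_worlds(ground_atoms) if conforms(w, constraints)]


# Three ground atoms over the EM predicates of the running example.
atoms = [
    "origIP(mw123sam1,baja)",
    "origIP(mw123sam1,krasnovia)",
    "mseTT(krasnovia,2)",
]
# Hypothetical constraint: the sample originates from at most one actor.
ic = [{"origIP(mw123sam1,baja)", "origIP(mw123sam1,krasnovia)"}]

print(len(all_worlds(atoms)))        # 8
print(len(valid_worlds(atoms, ic)))  # 6
```

Explicit enumeration is only practical for small sets of atoms; it is used here because it mirrors the definition directly.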

Atoms can also be combined into formulas using standard logical connectives:
conjunction (and), disjunction (or), and negation (not). These are written
using the symbols $\wedge$, $\vee$, $\neg$, respectively. We say a world $w$
satisfies a formula $f$, written $w \models f$, based on the following
inductive definition:

• if $f$ is a single atom, then $w \models f$ iff $f \in w$;

• if $f = \neg f'$ then $w \models f$ iff $w \not\models f'$;

• if $f = f' \wedge f''$ then $w \models f$ iff $w \models f'$ and $w \models f''$; and

• if $f = f' \vee f''$ then $w \models f$ iff $w \models f'$ or $w \models f''$.
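The inductive definition translates almost line for line into a recursive check. The sketch below is illustrative only: the nested-tuple encoding of formulas and the name `satisfies` are assumptions made for this example, not notation from the paper.

```python
from typing import FrozenSet, Tuple, Union

# A formula is an atom (str) or a tuple: ("not", f), ("and", f, g), ("or", f, g).
Formula = Union[str, Tuple]


def satisfies(world: FrozenSet[str], f: Formula) -> bool:
    """Return True iff the world satisfies formula f (w |= f)."""
    if isinstance(f, str):          # single atom: true iff it is in the world
        return f in world
    op = f[0]
    if op == "not":
        return not satisfies(world, f[1])
    if op == "and":
        return satisfies(world, f[1]) and satisfies(world, f[2])
    if op == "or":
        return satisfies(world, f[1]) or satisfies(world, f[2])
    raise ValueError(f"unknown connective: {op!r}")


w = frozenset({"govCybLab(baja)"})
q = ("or", "govCybLab(baja)", "mseTT(baja,2)")

print(satisfies(w, q))             # True
print(satisfies(w, ("not", q)))    # False
print(satisfies(frozenset(), q))   # False
```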

We use the notation $formula_{EM}$, $formula_{AM}$ to denote the set of all
possible (ground) formulas in the EM and AM, respectively. Also, note that we
use the notation $\top$, $\bot$ to represent tautologies (formulas that are
true in all worlds) and contradictions (formulas that are false in all
worlds), respectively.

$\mathbf{P}_{EM}$:
  origIP(M, X)       Malware M originated from an IP address belonging to actor X.
  malwInOp(M, O)     Malware M was used in cyber-operation O.
  mwHint(M, X)       Malware M contained a hint that it was created by actor X.
  compilLang(M, C)   Malware M was compiled in a system that used language C.
  nativLang(X, C)    Language C is the native language of actor X.
  inLgConf(X, X')    Actors X and X' are in a larger conflict with each other.
  mseTT(X, N)        There are at least N top-tier math-science-engineering
                     universities in country X.
  infGovSys(X, M)    Systems belonging to actor X were infected with malware M.
  cybCapAge(X, N)    Actor X has had a cyber-warfare capability for N years or less.
  govCybLab(X)       Actor X has a government cyber-security lab.

$\mathbf{P}_{AM}$:
  condOp(X, O)       Actor X conducted cyber-operation O.
  evidOf(X, O)       There is evidence that actor X conducted cyber-operation O.
  motiv(X, X')       Actor X had a motive to launch a cyber-attack against actor X'.
  isCap(X, O)        Actor X is capable of conducting cyber-operation O.
  tgt(X, O)          Actor X was the target of cyber-operation O.
  hasMseInvest(X)    Actor X has a significant investment in
                     math-science-engineering education.
  expCw(X)           Actor X has experience in conducting cyber-operations.

Figure 2: Predicate definitions for the environment and analytical models in the running example.

2.1 Environmental Model 

In this section we describe the first of the two models, namely the EM or
environmental model. This model is largely based on the probabilistic logic of
[7], which we now briefly review.

First, we define a probabilistic formula that consists of a formula $f$ over
atoms from $\mathbf{G}_{EM}$, a real number $p$ in the interval $[0, 1]$, and
an error tolerance $\epsilon \in [0, \min(p, 1 - p)]$. A probabilistic formula
is written as $f : p \pm \epsilon$. Intuitively, this statement is interpreted
as "formula $f$ is true with probability between $p - \epsilon$ and
$p + \epsilon$"; note that we make no statement about the probability
distribution over this interval. The uncertainty regarding the probability
values stems from the fact that certain assumptions (such as probabilistic
independence) may not be suitable in the environment being modeled.

Example 2.3 To continue our running example, consider the following set
$\Pi_{EM}$:

$f_1$ = govCybLab(baja) : 0.8 ± 0.1
$f_2$ = cybCapAge(baja, 5) : 0.2 ± 0.1
$f_3$ = mseTT(baja, 2) : 0.8 ± 0.1
$f_4$ = mwHint(mw123sam1, mojave)
        $\wedge$ compilLang(worm123, english) : 0.7 ± 0.2
$f_5$ = malwInOp(mw123sam1, worm123)
        $\wedge$ malwareRel(mw123sam1, mw123sam2)
        $\wedge$ mwHint(mw123sam2, mojave) : 0.6 ± 0.1
$f_6$ = inLgConf(baja, krasnovia)
        $\vee$ $\neg$cooper(baja, krasnovia) : 0.9 ± 0.1
$f_7$ = origIP(mw123sam1, baja) : 1 ± 0

Throughout the paper, let $\Pi'_{EM} = \{f_1, f_2, f_3\}$. ■



We now consider a probability distribution $Pr$ over the set
$\mathcal{W}_{EM}(\mathbf{IC}_{EM})$. We say that $Pr$ satisfies probabilistic
formula $f : p \pm \epsilon$ iff the following holds:

$$p - \epsilon \;\le \sum_{w \in \mathcal{W}_{EM}(\mathbf{IC}_{EM}),\ w \models f} Pr(w) \;\le\; p + \epsilon .$$

A set $\Pi_{EM}$ of probabilistic formulas is called a knowledge base. We say
that a probability distribution over $\mathcal{W}_{EM}(\mathbf{IC}_{EM})$
satisfies $\Pi_{EM}$ if and only if it satisfies all probabilistic formulas in
$\Pi_{EM}$.
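To make the satisfaction condition concrete, here is a small Python sketch under stated assumptions: a distribution is represented as a dictionary from worlds to probabilities, each formula is passed as a predicate over worlds, and the helper names (`prob_of`, `satisfies_pf`, `satisfies_kb`) and the example distribution are hypothetical. A small numerical tolerance is added to avoid spurious failures from floating-point rounding.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

World = FrozenSet[str]
Holds = Callable[[World], bool]          # stands in for "w |= f"
ProbFormula = Tuple[Holds, float, float]  # (f, p, epsilon)


def prob_of(dist: Dict[World, float], holds: Holds) -> float:
    """Total probability mass of the worlds that satisfy the formula."""
    return sum(p for w, p in dist.items() if holds(w))


def satisfies_pf(dist: Dict[World, float], holds: Holds,
                 p: float, eps: float, tol: float = 1e-9) -> bool:
    """Check p - eps <= Pr(f) <= p + eps, up to a rounding tolerance."""
    mass = prob_of(dist, holds)
    return p - eps - tol <= mass <= p + eps + tol


def satisfies_kb(dist: Dict[World, float], kb: Iterable[ProbFormula],
                 tol: float = 1e-9) -> bool:
    """A distribution satisfies a knowledge base iff it satisfies every formula."""
    return all(satisfies_pf(dist, holds, p, eps, tol) for holds, p, eps in kb)


lab, age = "govCybLab(baja)", "cybCapAge(baja,5)"
# Hypothetical distribution over the four worlds built from the two atoms.
dist = {
    frozenset({lab}): 0.7,
    frozenset({lab, age}): 0.1,
    frozenset({age}): 0.1,
    frozenset(): 0.1,
}
kb = [
    (lambda w: lab in w, 0.8, 0.1),  # govCybLab(baja) : 0.8 +/- 0.1
    (lambda w: age in w, 0.2, 0.1),  # cybCapAge(baja, 5) : 0.2 +/- 0.1
]

print(satisfies_kb(dist, kb))                              # True
print(satisfies_pf(dist, lambda w: lab in w, 0.5, 0.1))    # False
```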

It is possible to create probabilistic knowledge bases for which there is no
satisfying probability distribution. The following is a simple example of
this:

condOp(krasnovia, worm123) $\vee$ condOp(baja, worm123) : 0.4 ± 0;

condOp(krasnovia, worm123) $\wedge$ condOp(baja, worm123) : 0.6 ± 0.1.

Formulas and knowledge bases of this sort are inconsistent. In this paper, we
assume that information is properly extracted from a set of historic data and
hence consistent (recall that inconsistent information can only be handled in
the AM, not the EM). A consistent knowledge base could also be obtained as a
result of curation by experts, such that all inconsistencies were removed; see
[8, 9] for algorithms for learning rules of this type.
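The inconsistency of the two formulas above can be checked by hand. Writing $a$ for condOp(krasnovia, worm123) and $b$ for condOp(baja, worm123) (shorthand introduced only for this derivation), every world that satisfies the conjunction also satisfies the disjunction, which yields a contradiction:

```latex
\begin{align*}
\{w : w \models a \wedge b\} \subseteq \{w : w \models a \vee b\}
  &\;\Longrightarrow\; Pr(a \wedge b) \le Pr(a \vee b), \\
Pr(a \vee b) = 0.4, \quad Pr(a \wedge b) \ge 0.6 - 0.1 = 0.5
  &\;\Longrightarrow\; 0.5 \le Pr(a \wedge b) \le Pr(a \vee b) = 0.4 .
\end{align*}
```

Since $0.5 \le 0.4$ is false, no probability distribution can satisfy both formulas.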

The main kind of query that we require for the probabilistic model is the
maximum entailment problem: given a knowledge base $\Pi_{EM}$ and a
(non-probabilistic) formula $q$, identify $p, \epsilon$ such that all valid
probability distributions $Pr$ that satisfy $\Pi_{EM}$ also satisfy
$q : p \pm \epsilon$, and there does not exist $p', \epsilon'$ s.t.
$[p - \epsilon, p + \epsilon] \supset [p' - \epsilon', p' + \epsilon']$, where
all probability distributions $Pr$ that satisfy $\Pi_{EM}$ also satisfy
$q : p' \pm \epsilon'$. That is, given $q$, can we determine the probability
(with maximum tolerance) of statement $q$ given the information in
$\Pi_{EM}$? The approach adopted in [7] to solve this problem works as
follows. First, we must solve the linear program defined next.

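Definition 2.1 (EM-LP-MIN) Given a knowledge base $\Pi_{EM}$ and a formula
$q$:

• create a variable $x_i$ for each $w_i \in \mathcal{W}_{EM}(\mathbf{IC}_{EM})$;

• for each $f_j : p_j \pm \epsilon_j \in \Pi_{EM}$, create constraint:

$$p_j - \epsilon_j \;\le \sum_{w_i \in \mathcal{W}_{EM}(\mathbf{IC}_{EM}) \text{ s.t. } w_i \models f_j} x_i \;\le\; p_j + \epsilon_j ;$$

• finally, we also have a constraint:

$$\sum_{w_i \in \mathcal{W}_{EM}(\mathbf{IC}_{EM})} x_i = 1 .$$

The objective is to minimize the function:

$$\sum_{w_i \in \mathcal{W}_{EM}(\mathbf{IC}_{EM}) \text{ s.t. } w_i \models q} x_i .$$

We use the notation EP-LP-MIN($\Pi_{EM}$, $q$) to refer to the value of the
objective function in the solution to the EM-LP-MIN constraints.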

Let $\ell$ be the result of the process described in Definition 2.1. The next
step is to solve the linear program a second time, but instead maximizing the
objective function (we shall refer to this as EM-LP-MAX); let $u$ be the
result of this operation. In [7], it is shown that
$\epsilon = \frac{u - \ell}{2}$ and $p = \ell + \epsilon$ is the solution to
the maximum entailment problem. We note that although the above linear program
has an exponential number of variables in the worst case (i.e., no integrity
constraints), the presence of constraints has the potential to greatly reduce
this space. Further, there are also good heuristics (cf. [8, 10]) that have
been shown to provide highly accurate approximations with a reduced-size
linear program.

Example 2.4 Consider KB $\Pi'_{EM}$ from Example 2.3 and a set of ground atoms
restricted to those that appear in that program. Hence, we have:

$w_1$ = {govCybLab(baja), cybCapAge(baja, 5), mseTT(baja, 2)}
$w_2$ = {govCybLab(baja), cybCapAge(baja, 5)}
$w_3$ = {govCybLab(baja), mseTT(baja, 2)}
$w_4$ = {cybCapAge(baja, 5), mseTT(baja, 2)}
$w_5$ = {cybCapAge(baja, 5)}
$w_6$ = {govCybLab(baja)}
$w_7$ = {mseTT(baja, 2)}
$w_8$ = $\emptyset$

and suppose we wish to compute the probability for formula:

q = govCybLab(baja) $\vee$ mseTT(baja, 2).

For each formula in $\Pi'_{EM}$ we have a constraint, and for each world above
we have a variable. An objective function is created based on the worlds that
satisfy the query formula (here, worlds $w_1$-$w_4$, $w_6$, $w_7$). Hence,
EP-LP-MIN($\Pi'_{EM}$, $q$) can be written as:

$$\begin{aligned}
\min\ & x_1 + x_2 + x_3 + x_4 + x_6 + x_7 \quad \text{w.r.t.:} \\
& 0.7 \le x_1 + x_2 + x_3 + x_6 \le 0.9 \\
& 0.1 \le x_1 + x_2 + x_4 + x_5 \le 0.3 \\
& 0.8 \le x_1 + x_3 + x_4 + x_7 \le 1 \\
& x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 = 1
\end{aligned}$$

We can now solve EP-LP-MAX($\Pi'_{EM}$, $q$) and EP-LP-MIN($\Pi'_{EM}$, $q$)
to get solution 0.9 ± 0.1. ■
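The reported solution can be verified by hand from the constraints exactly as they are printed in the linear program above, assuming in addition that every $x_i \ge 0$ (each $x_i$ is the probability of a world). The witness assignments below are one possible choice among many; $\ell$ and $u$ denote the minimum and maximum of the objective.

```latex
\begin{align*}
\text{Lower bound: } & x_1 + x_2 + x_3 + x_4 + x_6 + x_7 \;\ge\; x_1 + x_3 + x_4 + x_7 \;\ge\; 0.8, \\
& \text{attained by } x_1 = 0.1,\ x_3 = 0.7,\ x_8 = 0.2 \text{ (all other } x_i = 0\text{)}, \text{ so } \ell = 0.8; \\
\text{Upper bound: } & x_1 + x_2 + x_3 + x_4 + x_6 + x_7 \;=\; 1 - x_5 - x_8 \;\le\; 1, \\
& \text{attained by } x_1 = 0.1,\ x_3 = 0.7,\ x_7 = 0.2 \text{ (all other } x_i = 0\text{)}, \text{ so } u = 1; \\
\epsilon &= \frac{u - \ell}{2} = \frac{1 - 0.8}{2} = 0.1, \qquad p = \ell + \epsilon = 0.9 .
\end{align*}
```

Both witness assignments satisfy all three interval constraints and sum to 1, so the bounds are tight and the answer is $q : 0.9 \pm 0.1$.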

2.2 Analytical Model 

For the analytical model (AM), we choose a structured argumentation framework
[11] due to several characteristics that make such frameworks highly
applicable to cyber-warfare domains. Unlike the EM, which describes
probabilistic information about the state of the real world, the AM must
allow for competing ideas - it must be able to represent 
contradictory information. The algorithmic approach al- 
lows for the creation of arguments based on the AM that 
may "compete" with each other to describe who conducted 
a given cyber-operation. In this competition - known as a 
dialectical process - one argument may defeat another based 
on a comparison criterion that determines the prevailing ar- 
gument. Resulting from this process, the InCA framework 
will determine arguments that are warranted (those that 
are not defeated by other arguments) thereby providing a 
suitable explanation for a given cyber-operation. 

The transparency provided by the system can allow ana- 
lysts to identify potentially incorrect input information and 
fine-tune the models or, alternatively, collect more infor- 
mation. In short, argumentation-based reasoning has been 
studied as a natural way to manage a set of inconsistent in- 
formation - it is the way humans settle disputes. As we will 
see, another desirable characteristic of (structured) argu- 
mentation frameworks is that, once a conclusion is reached, 
we are left with an explanation of how we arrived at it 
and information about why a given argument is warranted; 
this is very important information for analysts to have. In 
this section, we recall some preliminaries of the underly- 
ing argumentation framework used, and then introduce the 
analytical model (AM). 

Defeasible Logic Programming with Presumptions 

DeLP with Presumptions (PreDeLP) [12] is a formalism combining Logic
Programming with Defeasible Argumentation. We now briefly recall the basics of
PreDeLP; we refer the reader to [13, 12] for the complete presentation. The
formalism contains several different constructs: facts, presumptions, strict
rules, and defeasible rules. Facts are statements about the analysis that can
always be considered to be true, while presumptions are statements that may or
may not be true. Strict rules specify logical consequences of a set of facts
or presumptions (similar to an implication, though not the same) that must
always occur, while defeasible rules specify logical consequences that may be
assumed to be true when no contradicting information is present. These
constructs are used in the construction of arguments, and are part of a
PreDeLP program, which is a set of facts, strict rules, presumptions, and
defeasible rules. Formally, we use the notation
$\Pi_{AM} = (\Theta, \Omega, \Phi, \Delta)$ to denote a PreDeLP program, where
$\Omega$ is the set of strict rules, $\Theta$ is the set of facts, $\Delta$ is
the set of defeasible rules, and $\Phi$ is the set of presumptions. In
Figure 3 we provide an example $\Pi_{AM}$. We now describe each of these
constructs in detail.



$\Theta$:  $\theta_{1a}$ = evidOf(baja, worm123)
           $\theta_{1b}$ = evidOf(mojave, worm123)
           $\theta_{2}$  = motiv(baja, krasnovia)

$\Omega$:  $\omega_{1a}$ = $\neg$condOp(baja, worm123) $\leftarrow$ condOp(mojave, worm123)
           $\omega_{1b}$ = $\neg$condOp(mojave, worm123) $\leftarrow$ condOp(baja, worm123)
           $\omega_{2a}$ = condOp(baja, worm123) $\leftarrow$
                             evidOf(baja, worm123), isCap(baja, worm123),
                             motiv(baja, krasnovia), tgt(krasnovia, worm123)
           $\omega_{2b}$ = condOp(mojave, worm123) $\leftarrow$
                             evidOf(mojave, worm123), isCap(mojave, worm123),
                             motiv(mojave, krasnovia), tgt(krasnovia, worm123)

$\Phi$:    $\phi_{1}$ = hasMseInvest(baja) -<
           $\phi_{2}$ = tgt(krasnovia, worm123) -<
           $\phi_{3}$ = $\neg$expCw(baja) -<

$\Delta$:  $\delta_{1a}$ = condOp(baja, worm123) -< evidOf(baja, worm123)
           $\delta_{1b}$ = condOp(mojave, worm123) -< evidOf(mojave, worm123)
           $\delta_{2}$  = condOp(baja, worm123) -< isCap(baja, worm123)
           $\delta_{3}$  = condOp(baja, worm123) -<
                             motiv(baja, krasnovia), tgt(krasnovia, worm123)
           $\delta_{4}$  = isCap(baja, worm123) -< hasMseInvest(baja)
           $\delta_{5a}$ = $\neg$isCap(baja, worm123) -< $\neg$expCw(baja)
           $\delta_{5b}$ = $\neg$isCap(mojave, worm123) -< $\neg$expCw(mojave)

Figure 3: A ground argumentation framework.



Facts ($\Theta$) are ground literals representing atomic information or its
negation, using strong negation "$\neg$". All of the literals in our framework
must be formed with a predicate from the set $\mathbf{P}_{AM}$. Information in
this form cannot be contradicted.

Strict Rules ($\Omega$) represent non-defeasible cause-and-effect information
that resembles an implication (though the semantics is different since the
contrapositive does not hold) and are of the form
$L_0 \leftarrow L_1, \ldots, L_n$, where $L_0$ is a ground literal and
$\{L_i\}_{i>0}$ is a set of ground literals.

Presumptions ($\Phi$) are ground literals of the same form as facts, except
that they are not taken as being true but rather defeasible, which means that
they can be contradicted. Presumptions are denoted in the same manner as
facts, except that the symbol -< is added. While any literal can be used as a
presumption in InCA, we specifically require all literals created with the
predicate condOp to be defeasible.

Defeasible Rules ($\Delta$) represent tentative knowledge that can be used if
nothing can be posed against it. Just as presumptions are the defeasible
counterpart of facts, defeasible rules are the defeasible counterpart of
strict rules. They are of the form $L_0$ -< $L_1, \ldots, L_n$, where $L_0$ is
a ground literal and $\{L_i\}_{i>0}$ is a set of ground literals. Note that
with both strict and defeasible rules, strong negation is allowed in the head
of rules, and hence may be used to represent contradictory knowledge.

Even though the above constructs are ground, we allow for schematic versions
with variables that are used to represent sets of ground rules. We denote
variables with strings starting with an uppercase letter; Figure 4 shows a
non-ground example.

When a cyber-operation occurs, InCA must derive arguments as to who could have
potentially conducted the action. Derivation follows the same mechanism of
Logic Programming [13]. Since rule heads can contain strong negation, it is
possible to defeasibly derive contradictory literals from a program. For the
treatment of contradictory knowledge, PreDeLP incorporates a defeasible
argumentation formalism that allows the identification of the pieces of
knowledge that are in conflict, and through the previously mentioned
dialectical process decides which information prevails as warranted.
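The following Python sketch illustrates how contradictory literals can be defeasibly derived from a ground program. It is a simplification under stated assumptions, not the PreDeLP derivation procedure itself: it uses naive forward chaining, writes strong negation as a leading "~", treats presumptions as rules with an empty body, and includes only a subset of the ground rules from the running example. The names `derive`, `complement`, and `is_contradictory` are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Rule = Tuple[str, Tuple[str, ...]]  # (head literal, body literals)


def complement(literal: str) -> str:
    """Strong negation, written here with a leading '~'."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def derive(facts: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Forward-chain ground rules until no new literal can be added."""
    derived: Set[str] = set(facts)
    rule_list: List[Rule] = list(rules)
    changed = True
    while changed:
        changed = False
        for head, body in rule_list:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return derived


def is_contradictory(literals: Set[str]) -> bool:
    """True iff some literal and its complement are both present."""
    return any(complement(lit) in literals for lit in literals)


facts = {"evidOf(baja,worm123)", "evidOf(mojave,worm123)", "motiv(baja,krasnovia)"}
presumptions: List[Rule] = [
    ("hasMseInvest(baja)", ()),
    ("tgt(krasnovia,worm123)", ()),
    ("~expCw(baja)", ()),
]
strict: List[Rule] = [
    ("~condOp(baja,worm123)", ("condOp(mojave,worm123)",)),
    ("~condOp(mojave,worm123)", ("condOp(baja,worm123)",)),
]
defeasible: List[Rule] = [
    ("condOp(baja,worm123)", ("evidOf(baja,worm123)",)),
    ("condOp(mojave,worm123)", ("evidOf(mojave,worm123)",)),
    ("isCap(baja,worm123)", ("hasMseInvest(baja)",)),
    ("~isCap(baja,worm123)", ("~expCw(baja)",)),
]

strict_only = derive(facts, strict)
everything = derive(facts, presumptions + strict + defeasible)

print(is_contradictory(strict_only))  # False
print(is_contradictory(everything))   # True
```

The facts and strict rules alone derive nothing contradictory, whereas adding presumptions and defeasible rules yields both condOp(baja, worm123) and its strong negation, which is the situation the dialectical process is meant to resolve.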

This dialectical process involves the construction and 
evaluation of arguments that either support or interfere 
with a given query, building a dialectical tree in the pro- 
cess. Formally, we have: 

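Definition 2.2 (Argument) An argument $\langle \mathcal{A}, L \rangle$ for a
literal $L$ is a pair of the literal and a (possibly empty) set of the AM
($\mathcal{A} \subseteq \Pi_{AM}$) that provides a minimal proof for $L$
meeting the requirements: (1) $L$ is defeasibly derived from $\mathcal{A}$,
(2) $\Omega \cup \Theta \cup \mathcal{A}$ is not contradictory, and
(3) $\mathcal{A}$ is a minimal subset of $\Delta \cup \Phi$ satisfying 1 and
2, denoted $\langle \mathcal{A}, L \rangle$.

Literal $L$ is called the conclusion supported by the argument, and
$\mathcal{A}$ is the support of the argument. An argument
$\langle \mathcal{B}, L \rangle$ is a subargument of
$\langle \mathcal{A}, L' \rangle$ iff $\mathcal{B} \subseteq \mathcal{A}$. An
argument $\langle \mathcal{A}, L \rangle$ is presumptive iff
$\mathcal{A} \cap \Phi$ is not empty. We will also use
$\Omega(\mathcal{A}) = \mathcal{A} \cap \Omega$,
$\Theta(\mathcal{A}) = \mathcal{A} \cap \Theta$,
$\Delta(\mathcal{A}) = \mathcal{A} \cap \Delta$, and
$\Phi(\mathcal{A}) = \mathcal{A} \cap \Phi$.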



$\Theta$:  $\theta_{1}$ = evidOf(baja, worm123)
           $\theta_{2}$ = motiv(baja, krasnovia)

$\Omega$:  $\omega_{1}$ = $\neg$condOp(X, O) $\leftarrow$ condOp(X', O), X $\neq$ X'
           $\omega_{2}$ = condOp(X, O) $\leftarrow$ evidOf(X, O), isCap(X, O),
                            motiv(X, X'), tgt(X', O), X $\neq$ X'

$\Phi$:    $\phi_{1}$ = hasMseInvest(baja) -<
           $\phi_{2}$ = tgt(krasnovia, worm123) -<
           $\phi_{3}$ = $\neg$expCw(baja) -<

$\Delta$:  $\delta_{1}$ = condOp(X, O) -< evidOf(X, O)
           $\delta_{2}$ = condOp(X, O) -< isCap(X, O)
           $\delta_{3}$ = condOp(X, O) -< motiv(X, X'), tgt(X', O)
           $\delta_{4}$ = isCap(X, O) -< hasMseInvest(X)
           $\delta_{5}$ = $\neg$isCap(X, O) -< $\neg$expCw(X)

Figure 4: A non-ground argumentation framework.



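$\langle \mathcal{A}_1$, condOp(baja, worm123)$\rangle$       $\mathcal{A}_1 = \{\theta_{1a}, \delta_{1a}\}$
$\langle \mathcal{A}_2$, condOp(baja, worm123)$\rangle$       $\mathcal{A}_2 = \{\phi_1, \phi_2, \delta_4, \omega_{2a}, \theta_{1a}, \theta_2\}$
$\langle \mathcal{A}_3$, condOp(baja, worm123)$\rangle$       $\mathcal{A}_3 = \{\phi_1, \delta_2, \delta_4\}$
$\langle \mathcal{A}_4$, condOp(baja, worm123)$\rangle$       $\mathcal{A}_4 = \{\phi_2, \delta_3, \theta_2\}$
$\langle \mathcal{A}_5$, isCap(baja, worm123)$\rangle$        $\mathcal{A}_5 = \{\phi_1, \delta_4\}$
$\langle \mathcal{A}_6$, $\neg$condOp(baja, worm123)$\rangle$ $\mathcal{A}_6 = \{\delta_{1b}, \omega_{1a}\}$
$\langle \mathcal{A}_7$, $\neg$isCap(baja, worm123)$\rangle$  $\mathcal{A}_7 = \{\phi_3, \delta_{5a}\}$

Figure 5: Example ground arguments from Figure 3.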

Note that our definition differs slightly from that of [15] where DeLP is
introduced, as we include strict rules and facts as part of the argument. The
reason for this will become clear in Section 3. Arguments for our scenario are
shown in the following example.

Example 2.5 Figure 5 shows example arguments based on the knowledge base from
Figure 3. Note that the following relationship exists:

$\langle \mathcal{A}_5$, isCap(baja, worm123)$\rangle$ is a sub-argument of
$\langle \mathcal{A}_2$, condOp(baja, worm123)$\rangle$ and
$\langle \mathcal{A}_3$, condOp(baja, worm123)$\rangle$. ■

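Given argument $\langle \mathcal{A}_1, L_1 \rangle$, counter-arguments are
arguments that contradict it. Argument $\langle \mathcal{A}_2, L_2 \rangle$
counterargues or attacks $\langle \mathcal{A}_1, L_1 \rangle$ at literal $L'$
iff there exists a subargument $\langle \mathcal{A}, L'' \rangle$ of
$\langle \mathcal{A}_1, L_1 \rangle$ s.t. the set
$\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup \Theta(\mathcal{A}_1) \cup \Theta(\mathcal{A}_2) \cup \{L_2, L''\}$
is contradictory.

Example 2.6 Consider the arguments from Example 2.5. The following are some of
the attack relationships between them: $\mathcal{A}_1$, $\mathcal{A}_2$,
$\mathcal{A}_3$, and $\mathcal{A}_4$ all attack $\mathcal{A}_6$;
$\mathcal{A}_5$ attacks $\mathcal{A}_7$; and $\mathcal{A}_7$ attacks
$\mathcal{A}_2$. ■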

A proper defeater of an argument $\langle \mathcal{A}, L \rangle$ is a
counter-argument that, by some criterion, is considered to be better than
$\langle \mathcal{A}, L \rangle$; if the two are incomparable according to
this criterion, the counterargument is said to be a blocking defeater. An
important characteristic of PreDeLP is that the argument comparison criterion
is modular, and thus the most appropriate criterion for the domain that is
being represented can be selected; the default criterion used in classical
defeasible logic programming (from which PreDeLP is derived) is generalized
specificity [16], though an extension of this criterion is required for
arguments using presumptions [12]. We briefly recall this criterion next: the
first definition is for generalized specificity, which is subsequently used in
the definition of presumption-enabled specificity.

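Definition 2.3 Let $\Pi_{AM} = (\Theta, \Omega, \Phi, \Delta)$ be a PreDeLP
program and let $\mathcal{F}$ be the set of all literals that have a
defeasible derivation from $\Pi_{AM}$. An argument
$\langle \mathcal{A}_1, L_1 \rangle$ is preferred to
$\langle \mathcal{A}_2, L_2 \rangle$, denoted with
$\mathcal{A}_1 \succ_{PS} \mathcal{A}_2$, iff the two following conditions
hold:

1. For all $H \subseteq \mathcal{F}$ such that
   $\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup H$ is
   non-contradictory: if there is a derivation for $L_1$ from
   $\Omega(\mathcal{A}_2) \cup \Omega(\mathcal{A}_1) \cup \Delta(\mathcal{A}_1) \cup H$,
   and there is no derivation for $L_1$ from
   $\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup H$, then there is a
   derivation for $L_2$ from
   $\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup \Delta(\mathcal{A}_2) \cup H$.

2. There is at least one set $H' \subseteq \mathcal{F}$, with
   $\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup H'$
   non-contradictory, such that there is a derivation for $L_2$ from
   $\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup H' \cup \Delta(\mathcal{A}_2)$,
   there is no derivation for $L_2$ from
   $\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup H'$, and there is no
   derivation for $L_1$ from
   $\Omega(\mathcal{A}_1) \cup \Omega(\mathcal{A}_2) \cup H' \cup \Delta(\mathcal{A}_1)$.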

Intuitively, the principle of specificity says that, in the 
presence of two conflicting lines of argument about a propo- 
sition, the one that uses more of the available information is 
more convincing. A classic example involves a bird, Tweety, 
and arguments stating that it both flies (because it is a 
bird) and doesn't fly (because it is a penguin). The latter 
argument uses more information about Tweety - it is more 
specific - and is thus the stronger of the two. 

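Definition 2.4 ([12]) Let $\Pi_{AM} = (\Theta, \Omega, \Phi, \Delta)$ be a
PreDeLP program. An argument $\langle \mathcal{A}_1, L_1 \rangle$ is preferred
to $\langle \mathcal{A}_2, L_2 \rangle$, denoted with
$\mathcal{A}_1 \succ \mathcal{A}_2$, iff any of the following conditions hold:

1. $\langle \mathcal{A}_1, L_1 \rangle$ and
   $\langle \mathcal{A}_2, L_2 \rangle$ are both factual arguments and
   $\langle \mathcal{A}_1, L_1 \rangle \succ_{PS} \langle \mathcal{A}_2, L_2 \rangle$.

2. $\langle \mathcal{A}_1, L_1 \rangle$ is a factual argument and
   $\langle \mathcal{A}_2, L_2 \rangle$ is a presumptive argument.

3. $\langle \mathcal{A}_1, L_1 \rangle$ and
   $\langle \mathcal{A}_2, L_2 \rangle$ are presumptive arguments, and

   (a) $\Phi(\mathcal{A}_1) \subsetneq \Phi(\mathcal{A}_2)$, or

   (b) $\Phi(\mathcal{A}_1) = \Phi(\mathcal{A}_2)$ and
       $\langle \mathcal{A}_1, L_1 \rangle \succ_{PS} \langle \mathcal{A}_2, L_2 \rangle$.

Generally, if $\mathcal{A}$, $\mathcal{B}$ are arguments with rules $X$ and
$Y$, resp., and $X \subset Y$, then $\mathcal{A}$ is stronger than
$\mathcal{B}$. This also holds when $\mathcal{A}$ and $\mathcal{B}$ use
presumptions $P_1$ and $P_2$, resp., and $P_1 \subset P_2$.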

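Example 2.7 The following are relationships between arguments from
Example 2.5, based on Definitions 2.3 and 2.4:

$\mathcal{A}_1$ and $\mathcal{A}_6$ are incomparable (blocking defeaters);
$\mathcal{A}_6 \succ \mathcal{A}_2$, and thus $\mathcal{A}_6$ defeats $\mathcal{A}_2$;
$\mathcal{A}_6 \succ \mathcal{A}_3$, and thus $\mathcal{A}_6$ defeats $\mathcal{A}_3$;
$\mathcal{A}_6 \succ \mathcal{A}_4$, and thus $\mathcal{A}_6$ defeats $\mathcal{A}_4$;
$\mathcal{A}_5$ and $\mathcal{A}_7$ are incomparable (blocking defeaters). ■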

A sequence of arguments called an argumentation line 
thus arises from this attack relation, where each argument 
defeats its predecessor. To avoid undesirable sequences, 
that may represent circular or fallacious argumentation 
lines, in DeLP an argumentation line is acceptable if it sat- 
isfies certain constraints (see [13]). A literal L is warranted 
if there exists a non-defeated argument A supporting L. 

Clearly, there can be more than one defeater for a particular argument
$\langle \mathcal{A}, L \rangle$. Therefore, many acceptable argumentation
lines could arise from $\langle \mathcal{A}, L \rangle$, leading to a tree
structure. The tree is built from the set of all argumentation lines rooted in
the initial argument. In a dialectical tree, every node (except the root)
represents a defeater of its parent, and leaves correspond to undefeated
arguments. Each path from the root to a leaf corresponds to a different
acceptable argumentation line. A dialectical tree provides a structure for
considering all the possible acceptable argumentation lines that can be
generated for deciding whether an argument is defeated. We call this tree
dialectical because it represents an exhaustive dialectical analysis for the
argument in its root. For argument $\langle \mathcal{A}, L \rangle$, we denote
its dialectical tree with $\mathcal{T}(\langle \mathcal{A}, L \rangle)$.

af($\theta_1$) = origIP(worm123, baja) $\vee$
                 (malwInOp(worm123, o) $\wedge$
                   (mwHint(worm123, baja) $\vee$
                     (compilLang(worm123, c) $\wedge$ nativLang(baja, c))))
af($\theta_2$) = inLgConf(baja, krasnovia)
af($\omega_1$) = True
af($\omega_2$) = True
af($\phi_1$)   = mseTT(baja, 2) $\vee$ govCybLab(baja)
af($\phi_2$)   = malwInOp(worm123, o') $\wedge$ infGovSys(krasnovia, worm123)
af($\phi_3$)   = cybCapAge(baja, 5)
af($\delta_1$) = True
af($\delta_2$) = True
af($\delta_3$) = True
af($\delta_4$) = True
af($\delta_5$) = True

Given a literal L and an argument ⟨A, L⟩, in order to decide
whether or not L is warranted, every node in the dialectical tree
T(⟨A, L⟩) is recursively marked as "D" (defeated) or "U"
(undefeated), obtaining a marked dialectical tree T*(⟨A, L⟩)
where:

• All leaves in T*(⟨A, L⟩) are marked as "U", and

• Let ⟨B, q⟩ be an inner node of T*(⟨A, L⟩). Then, ⟨B, q⟩ will be
marked as "U" iff every child of ⟨B, q⟩ is marked as "D". Node
⟨B, q⟩ will be marked as "D" iff it has at least one child marked
as "U".
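
The marking procedure is a short recursion over the tree. The following is a minimal Python sketch, not the authors' implementation; the `Node` class, the `mark` function, and the example tree with arguments A, B, C, and E are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One argument in a dialectical tree; its children are its defeaters."""
    label: str
    defeaters: List["Node"] = field(default_factory=list)


def mark(node: Node) -> str:
    """Return "U" (undefeated) or "D" (defeated) for the given node."""
    # A leaf has no defeaters, so all() over an empty sequence is True
    # and the leaf is marked "U". An inner node is "U" only when every
    # one of its defeaters is itself marked "D".
    if all(mark(child) == "D" for child in node.defeaters):
        return "U"
    return "D"


# Hypothetical tree: A is attacked by B and C; B is attacked by E.
tree = Node("A", [Node("B", [Node("E")]), Node("C")])
root_mark = mark(tree)  # C is an undefeated leaf, so the root A is "D"
```

In this sketch E defeats B, but C remains an undefeated leaf, so the root is still marked "D"; removing C would flip the root to "U".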

Given argument ⟨A, L⟩ over Π_AM, if the root of T*(⟨A, L⟩)
is marked "U", then we say that T*(⟨A, L⟩) warrants L and that
L is warranted from Π_AM. (Warranted arguments correspond to
those in the grounded extension of a Dung argumentation
system [17].)

We can then extend the idea of a dialectical tree to a
dialectical forest. For a given literal L, a dialectical forest
F(L) consists of the set of dialectical trees for all arguments
for L. We shall denote a marked dialectical forest, the set of
all marked dialectical trees for arguments for L, as F*(L).
Hence, for a literal L, we say it is warranted if there is at
least one argument for that literal in the dialectical forest
F*(L) that is labeled "U", not warranted if there is at least
one argument for literal ¬L in the forest F*(¬L) that is
labeled "U", and undecided otherwise.
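
The three-valued status of a literal can be read directly off the root marks of the two forests. The sketch below assumes the root marks have already been computed; the function name `literal_status` and its inputs are hypothetical, and checking the forest for L before the forest for ¬L is an implementation choice of this sketch.

```python
from typing import Iterable


def literal_status(roots_for: Iterable[str], roots_against: Iterable[str]) -> str:
    """Classify a literal L from the root marks of its marked forests.

    roots_for     -- root marks ("U"/"D") of the trees for arguments for L
    roots_against -- root marks ("U"/"D") of the trees for arguments for not-L
    """
    if "U" in set(roots_for):
        return "warranted"
    if "U" in set(roots_against):
        return "not warranted"
    return "undecided"


status_a = literal_status(["D", "U"], ["D"])  # one undefeated argument for L
status_b = literal_status(["D"], ["U"])       # one undefeated argument for not-L
status_c = literal_status(["D"], [])          # no undefeated argument either way
```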

3 The InCA Framework 

Having defined our environmental and analytical models
(Π_EM and Π_AM, respectively), we now define how the two
relate, which allows us to complete the definition of our InCA
framework.

The key intuition here is that given a Π_AM, every element
of Ω ∪ Θ ∪ Δ ∪ Φ might only hold in certain worlds in the set
W_EM, that is, worlds specified by the environment model. As
formulas over the environmental atoms in set G_EM specify
subsets of W_EM (i.e., the worlds that satisfy them), we can use
these formulas to identify the conditions under which a
component of Ω ∪ Θ ∪ Δ ∪ Φ can be true. Recall that we use the
notation formula_EM to denote the set of all possible formulas
over G_EM. Therefore, it makes sense to associate elements of
Ω ∪ Θ ∪ Δ ∪ Φ with a formula from formula_EM. In doing so, we
can in turn compute the probabilities of subsets of
Ω ∪ Θ ∪ Δ ∪ Φ using the information contained in Π_EM, which we
shall describe shortly. We first introduce the notion of
annotation function, which associates elements of
Ω ∪ Θ ∪ Δ ∪ Φ with elements of formula_EM.

We also note that, by using the annotation function (see
Figure 6), we may have certain statements that appear as
both facts and presumptions (likewise for strict and defeasible
rules). However, these constructs would have different
annotations, and thus be applicable in different worlds.

¹ In the sense of providing reasons for and against a position.



Figure 6: Example annotation function. 

Suppose we added the following presumptions to our running
example:

φ_3 = evidOf(X, O) -<, and

φ_4 = motiv(X, X') -<.

Note that these presumptions are constructed using the
same formulas as facts θ_1, θ_2. Suppose we extend af as
follows:

af(φ_3) = malwInOp(M, O) ∧ malwareRel(M, M') ∧ mwHint(M', X)
af(φ_4) = inLgConf(Y, X') ∧ cooper(X, Y)

So, for instance, unlike θ_1, φ_3 can potentially be true in any
world of the form:

{malwInOp(M, O), malwareRel(M, M'), mwHint(M', X)}

while θ_1 cannot be considered in any of those worlds.

With the annotation function, we now have all the com- 
ponents to formally define an InCA framework. 

Definition 3.1 (InCA Framework) Given environmental
model Π_EM, analytical model Π_AM, and annotation
function af, I = (Π_EM, Π_AM, af) is an InCA framework.

Given the setup described above, we consider a world- 
based approach - the defeat relationship among arguments 
will depend on the current state of the world (based on the 
EM). Hence, we now define the status of an argument with 
respect to a given world. 

Definition 3.2 (Validity) Given InCA framework
I = (Π_EM, Π_AM, af), argument ⟨A, L⟩ is valid w.r.t. world
w ∈ W_EM iff ∀c ∈ A, w ⊨ af(c).

In other words, an argument is valid with respect to w
if the rules, facts, and presumptions in that argument are
present in w; the argument can then be built from information
that is available in that world. In this paper, we
extend the notion of validity to argumentation lines,
dialectical trees, and dialectical forests in the expected way (an
argumentation line is valid w.r.t. w iff all arguments that
comprise that line are valid w.r.t. w).
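
The validity test amounts to evaluating the annotation of every component of an argument against a world. The sketch below assumes a world is represented as the set of environmental atoms that are true in it and that each annotation is a Boolean function of a world. The component labels (`theta_2`, `phi_3`, `delta_1`), the argument built from them, and the two worlds are hypothetical and chosen only to illustrate the check.

```python
from typing import Callable, Dict, FrozenSet, Iterable

World = FrozenSet[str]             # the EM atoms that are true in this world
Formula = Callable[[World], bool]  # an EM formula, evaluated against a world


def atom(name: str) -> Formula:
    """Formula that holds exactly when the named atom is true in the world."""
    return lambda w: name in w


def is_valid(argument: Iterable[str], af: Dict[str, Formula], world: World) -> bool:
    """An argument is valid w.r.t. a world iff the world satisfies af(c)
    for every component c (fact, rule, or presumption) of the argument."""
    return all(af[c](world) for c in argument)


TRUE: Formula = lambda w: True  # annotation that holds in every world

# Hypothetical annotation function over three components.
af = {
    "theta_2": atom("inLgConf(baja,krasnovia)"),
    "phi_3": atom("cybCapAge(baja,5)"),
    "delta_1": TRUE,
}
argument = ["theta_2", "phi_3", "delta_1"]  # hypothetical argument

w_a = frozenset({"inLgConf(baja,krasnovia)", "cybCapAge(baja,5)"})
w_b = frozenset({"inLgConf(baja,krasnovia)"})

valid_in_a = is_valid(argument, af, w_a)  # every annotation holds in w_a
valid_in_b = is_valid(argument, af, w_b)  # the annotation of phi_3 fails in w_b
```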

Example 3.1 Consider worlds w_1, ..., w_8 from Example 2.4
along with the argument ⟨A_5, isCap(baja, worm123)⟩
from Example 2.5. This argument is valid in worlds w_1 to w_4,
w_6, and w_7. ■

We now extend the idea of a dialectical tree w.r.t.
worlds: for a given world w ∈ W_EM, the dialectical
(resp., marked dialectical) tree induced by w is denoted
by T_w(⟨A, L⟩) (resp., T*_w(⟨A, L⟩)). We require all arguments
and defeaters in these trees to be valid with respect
to w. Likewise, we extend the notion of dialectical forests in
the same manner (denoted with F_w(L) and F*_w(L), respectively).
Based on these concepts we introduce the notion
of warranting scenario.

Definition 3.3 (Warranting Scenario) Let I = (Π_EM,
Π_AM, af) be an InCA framework and L be a ground literal
over G_AM; a world w ∈ W_EM is said to be a warranting
scenario for L (denoted w ⊢_war L) iff there is a dialectical
forest F*_w(L) in which L is warranted and F*_w(L) is valid
w.r.t. w.






Example 3.2 Following from Example 3.1, argument
⟨A_5, isCap(baja, worm123)⟩ is warranted in worlds w_3, w_6,
and w_7. ■

Hence, the set of worlds in the EM where a literal L in the
AM must be true is exactly the set of warranting scenarios;
these are the "necessary" worlds, denoted:

$$nec(L) = \{ w \in W_{EM} \mid w \vdash_{war} L \}.$$

Now, the set of worlds in the EM where AM literal L can
be true is the following; these are the "possible" worlds,
denoted:

$$poss(L) = \{ w \in W_{EM} \mid w \nvdash_{war} \neg L \}.$$

The following example illustrates these concepts.

Example 3.3 Following from Example 3.1,

nec(isCap(baja, worm123)) = {w_3, w_6, w_7} and

poss(isCap(baja, worm123)) = {w_1, w_2, w_3, w_4, w_6, w_7}. ■

Hence, for a given InCA framework I, if we are given
a probability distribution Pr over the worlds in the EM,
then we can compute an upper and lower bound on the
probability of literal L (denoted P_{L,Pr,I}) as follows:

$$\ell_{L,Pr,I} = \sum_{w \in nec(L)} Pr(w),$$

$$u_{L,Pr,I} = \sum_{w \in poss(L)} Pr(w),$$

and

$$\ell_{L,Pr,I} \le P_{L,Pr,I} \le u_{L,Pr,I}.$$
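
As an illustration of these two sums, the sketch below takes the sets nec and poss reported in Example 3.3 and pairs them with a uniform distribution over the eight worlds. The uniform distribution is an assumption made only for this illustration and does not come from the paper, and `probability_bounds` is a hypothetical helper name.

```python
from fractions import Fraction
from typing import Dict, Set, Tuple


def probability_bounds(
    pr: Dict[str, Fraction], nec: Set[str], poss: Set[str]
) -> Tuple[Fraction, Fraction]:
    """Lower bound: total probability of the necessary worlds.
    Upper bound: total probability of the possible worlds."""
    lower = sum((pr[w] for w in nec), Fraction(0))
    upper = sum((pr[w] for w in poss), Fraction(0))
    return lower, upper


# Assumed (not from the paper): a uniform distribution over w1..w8.
pr = {f"w{i}": Fraction(1, 8) for i in range(1, 9)}

# Sets taken from Example 3.3.
nec = {"w3", "w6", "w7"}
poss = {"w1", "w2", "w3", "w4", "w6", "w7"}

lower, upper = probability_bounds(pr, nec, poss)  # 3/8 and 3/4 under this Pr
```

Exact fractions are used here so that the sums are not affected by floating-point rounding.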



Now let us consider the computation of probability
bounds on a literal when we are given a knowledge base
Π_EM in the environmental model, which is specified in I,
instead of a probability distribution over all worlds. For a
given world w ∈ W_EM, let for(w) = (⋀_{a ∈ w} a) ∧ (⋀_{a ∉ w} ¬a),
that is, a formula that is satisfied only by world w. Now
we can determine the upper and lower bounds on the
probability of a literal w.r.t. Π_EM (denoted P_{L,I}) as follows:

$$\ell_{L,I} = \text{EP-LP-MIN}\Big( \Pi_{EM}, \bigvee_{w \in nec(L)} for(w) \Big),$$

$$u_{L,I} = \text{EP-LP-MAX}\Big( \Pi_{EM}, \bigvee_{w \in poss(L)} for(w) \Big),$$

and

$$\ell_{L,I} \le P_{L,I} \le u_{L,I}.$$

Hence,

$$P_{L,I} = \Big( \ell_{L,I} + \frac{u_{L,I} - \ell_{L,I}}{2} \Big) \pm \frac{u_{L,I} - \ell_{L,I}}{2}.$$



Example 3.4 Following from Example 3.1, for argument
⟨A_5, isCap(baja, worm123)⟩ we can compute
P_{isCap(baja,worm123),I} (where I = (Π'_EM, Π_AM, af)). Note
that for the upper bound, the linear program we need to set
up is as in Example 2.4. For the lower bound, the objective
function changes to: min x_3 + x_6 + x_7. From these linear
constraints, we obtain: P_{isCap(baja,worm123),I} = 0.75 ± 0.25.
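
The reported value is the midpoint of the bound interval plus or minus half its width, so 0.75 ± 0.25 corresponds to a lower bound of 0.5 and an upper bound of 1. The following sketch performs that conversion; `midpoint_form` is a hypothetical helper name, and the bounds passed to it are the ones implied by the result above rather than values recomputed from the linear program.

```python
from fractions import Fraction
from typing import Tuple


def midpoint_form(lower: Fraction, upper: Fraction) -> Tuple[Fraction, Fraction]:
    """Rewrite the interval [lower, upper] as (midpoint, half_width)."""
    if not (0 <= lower <= upper <= 1):
        raise ValueError("expected 0 <= lower <= upper <= 1")
    half_width = (upper - lower) / 2
    return lower + half_width, half_width


# Bounds implied by 0.75 +/- 0.25: lower = 0.5, upper = 1.
mid, half = midpoint_form(Fraction(1, 2), Fraction(1))
```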



4 Attribution Queries 

We now have the necessary elements required to formally 
define the kind of queries that correspond to the attribution 
problems studied in this paper. 

Definition 4.1 Let I = (Π_EM, Π_AM, af) be an InCA
framework, S ⊆ C_act (the set of "suspects"), O ∈ C_ops
(the "operation"), and E ⊆ G_EM (the "evidence"). An actor
A ∈ S is said to be a most probable suspect iff there does
not exist A' ∈ S such that

$$P_{condOp(A',O),I'} > P_{condOp(A,O),I'},$$

where I' = (Π_EM ∪ Π_E, Π_AM, af') with Π_E defined as
the set of evidence items in E, each added with probability 1.

Given the above definition, we refer to Q = (I, S, O, E)
as an attribution query, and A as an answer to Q. We note
that in the above definition, the items of evidence are added
to the environmental model with a probability of 1. While
in general this may be the case, there are often instances
in analysis of a cyber-operation where the evidence may be
true with some degree of uncertainty. Allowing for
probabilistic evidence is a simple extension to Definition 4.1 that
does not cause any changes to the results of this paper.
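
Answering an attribution query reduces to a maximization over the suspects once a probability is available for each of them. The sketch below assumes a single point probability per suspect (for example, the midpoint of the computed bounds), which is a simplification made for this illustration. The probabilities and the actor name `mojave` are hypothetical, and `most_probable_suspects` is not a function from the paper.

```python
from typing import Dict, List


def most_probable_suspects(prob: Dict[str, float]) -> List[str]:
    """Return every suspect for which no other suspect has a strictly
    higher probability of having conducted the operation."""
    if not prob:
        return []
    best = max(prob.values())
    return sorted(actor for actor, p in prob.items() if p == best)


# Hypothetical probabilities that each suspect conducted the operation.
prob = {"baja": 0.75, "krasnovia": 0.40, "mojave": 0.75}
answers = most_probable_suspects(prob)  # ties yield more than one answer
```

Because the definition only rules out a strictly more probable rival, tied suspects are all valid answers to the same query.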

To understand how uncertain evidence can be present in
a cyber-security scenario, consider the following. In
Symantec's initial analysis of the Stuxnet worm, they found the
routine designed to attack the S7-417 logic controller was
incomplete, and hence would not function [18]. However,
industrial control system expert Ralph Langner claimed that
the incomplete code would run provided a missing data
block is generated, which he thought was possible [19]. In
this case, though the code was incomplete, there was clearly
uncertainty regarding its usability. This situation provides
a real-world example of the need to compare arguments:
in this case, in the worlds where both arguments are valid,
Langner's argument would likely defeat Symantec's by
generalized specificity (the outcome, of course, will depend on
the exact formalization of the two). Note that Langner
was later vindicated by the discovery of an older sample,
Stuxnet 0.5, which generated the data block.²

InCA also allows for a variety of scenarios relevant to the
attribution problem. For instance, we can easily allow for
the modeling of non-state actors by extending the available
constants, for example, traditional groups such as Hezbollah,
which has previously wielded its cyber-warfare capabilities
in operations against Israel [1]. Likewise, InCA can
also be used to model cooperation among different actors
in performing an attack, including the relationship between
non-state actors and nation-states, such as the potential
connection between Iran and militants stealing UAV feeds in
Iraq, or the much-hypothesized relationship between
hacktivist youth groups and the Russian government [1].
Another aspect that can be modeled is deception where, for
instance, an actor may leave false clues in a piece of malware
to lead an analyst to believe a third party conducted
the operation. Such a deception scenario can be easily
created by adding additional rules in the AM that allow for
the creation of such counter-arguments. Another type of
deception that could occur includes attacks being launched
from a system not in the responsible party's area, but under
their control (e.g., see [5]). Again, modeling who controls a
given system can be easily accomplished in our framework,
and doing so would simply entail extending an argumentation
line. Further, campaigns of cyber-operations can also
be modeled, as well as relationships among malware and/or
attacks (as detailed in [20]).

As with all of these abilities, InCA provides the analyst 
the means to model a complex situation in cyber-warfare 
but saves him from carrying out the reasoning associated 
with such a situation. Additionally, InCA results are con- 
structive, so an analyst can "trace-back" results to better 
understand how the system arrived at a given conclusion. 

5 Conclusion 

In this paper we introduced InCA, a new framework that
allows the modeling of various cyber-warfare/cyber-security
scenarios in order to help answer the attribution question
by means of a combination of probabilistic modeling and
argumentative reasoning. This is the first framework, to our
knowledge, that addresses the attribution problem while
allowing for multiple pieces of evidence from different sources,
including traditional (non-cyber) forms of intelligence such
as human intelligence. Further, our framework is the first
to extend Defeasible Logic Programming with probabilistic
information. Currently, we are implementing InCA and
the associated algorithms and heuristics to answer these
queries. We also feel that there are some key areas to
explore relating to this framework, in particular:

² http://www.symantec.com/connect/blogs/stuxnet-05-disrupting-uranium-processing-natanz

• Automatically learning the EM and AM from data. 

• Conducting attribution decisions in near real time. 

• Identifying additional evidence that must be collected 
in order to improve a given attribution query. 

• Improving scalability of InCA to handle large datasets. 

Future work will be carried out in these directions, focusing 
on the use of both real and synthetic datasets for empirical 
evaluations. 